Search | WHO COVID-19 Research Database

1.

Mixup-Inf-Net: A data augmentation algorithm for segmentation of new coronary pneumonia infections

Zeng, K.; Pei, X.; Chen, P..

Proceedings of SPIE - The International Society for Optical Engineering ; 12626, 2023.

Article in English | Scopus | ID: covidwho-20245242

ABSTRACT

In 2020, the global spread of Coronavirus Disease 2019 exposed the entire world to a severe health crisis. Equipment shortages and harsh testing environments have limited fast and accurate screening of suspected cases. The current diagnosis of suspected cases has benefited greatly from the use of radiographic brain imaging, including X-ray and scintigraphy, as a crucial addition to screening tests for new coronary pneumonia disease. However, it is impractical to gather enormous volumes of data quickly, which makes it difficult to train deep models. To solve these problems, we obtained a new dataset by applying the Mixup data augmentation method to the chest CT slices used. Our approach uses the lung infection segmentation deep network (Inf-Net [1]) and adds a semi-supervised learning framework to form the Mixup-Inf-Net semi-supervised learning framework model, which identifies COVID-19 infection areas in chest CT slices. The system depends primarily on unlabeled data and requires only a minimal amount of annotated data; therefore, the unlabeled data generated by Mixup provides good assistance. Our framework can be used to improve learning and performance. A variety of tests use the SemiSeg dataset and the actual 3D CT images that we produced, and the analysis shows that the Mixup-Inf-Net semi-supervised learning framework model outperforms most SOTA segmentation models in this study and enhances segmentation performance. © 2023 SPIE.
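
The abstract does not say how Mixup is applied to the CT slices, so the following is only a minimal sketch of the standard Mixup formulation: two samples are blended with a weight drawn from a Beta(alpha, alpha) distribution. The function name, the `alpha` value, the toy pixel values, and the choice to mix the masks as well are assumptions for illustration, not the authors' implementation.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Blend two (slice, mask) pairs with a weight lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Two flattened toy "CT slices" with binary infection masks (made-up values).
slice_a, mask_a = [0.0, 0.2, 0.9, 1.0], [0, 0, 1, 1]
slice_b, mask_b = [1.0, 0.8, 0.1, 0.0], [1, 1, 0, 0]

random.seed(0)
x_mix, y_mix, lam = mixup(slice_a, mask_a, slice_b, mask_b)
```

A small `alpha` concentrates the weight near 0 or 1, so most mixed slices stay close to one of the two originals.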

2.

Semi-Supervised Modified-UNet for Lung Infection Image Segmentation

Upadhyay, A. K.; Bhandari, A. K..

IEEE Transactions on Radiation and Plasma Medical Sciences ; : 1-1, 2023.

Article in English | Scopus | ID: covidwho-20244069

ABSTRACT

Automatic lung infection segmentation in computed tomography (CT) scans can offer great assistance in radiological diagnosis by improving accuracy and reducing the time required for diagnosis. The biggest challenges for deep learning (DL) models in segmenting the infection region are the high variance in infection characteristics, fuzzy boundaries between infected and normal tissues, and the difficulty of obtaining a large amount of annotated data for training. To resolve such issues, we propose a Modified U-Net (Mod-UNet) model with minor architectural changes and significant modifications in the training process of vanilla 2D UNet. As part of these modifications, we updated the loss function, optimization function, and regularization methods, added a learning rate scheduler and applied advanced data augmentation techniques. Segmentation results on two Covid-19 Lung CT segmentation datasets show that the performance of Mod-UNet is considerably better than the baseline U-Net. Furthermore, to mitigate the issue of lack of annotated data, the Mod-UNet is used in a semi-supervised framework (Semi-Mod-UNet) which works on a random sampling approach to progressively enlarge the training dataset from a large pool of unannotated CT slices. Exhaustive experiments on the two Covid-19 CT segmentation datasets and on a real lung CT volume show that the Mod-UNet and Semi-Mod-UNet significantly outperform other state-of-the-art approaches in automated lung infection segmentation. IEEE

3.

Dual Consistency-enhanced Semi-supervised Sentiment Analysis towards COVID-19 Tweets

Sun, T.; Jing, L.; Wei, Y.; Song, X.; Cheng, Z.; Nie, L..

IEEE Transactions on Knowledge and Data Engineering ; : 1-13, 2023.

Article in English | Scopus | ID: covidwho-20243432

ABSTRACT

In the context of COVID-19, numerous people present their opinions through social networks. It is thus highly desirable to conduct sentiment analysis of COVID-19 tweets to learn the public's attitudes and help the government make proper guidelines for avoiding social unrest. Although many efforts have studied text-based sentiment classification in various domains (e.g., delivery and shopping reviews), it is hard to directly use these classifiers for the sentiment analysis of COVID-19 tweets due to the domain gap. In fact, developing a sentiment classifier for COVID-19 tweets is mainly challenged by the limited annotated training data, as well as the diverse and informal expressions of user-generated posts. To address these challenges, we construct a large-scale COVID-19 dataset from Weibo and propose a dual COnsistency-enhanced semi-superVIseD network for Sentiment Analysis (COVID-SA). In particular, we first introduce a knowledge-based augmentation method to augment data and enhance the model's robustness. We then employ BERT as the text encoder backbone for labeled, unlabeled, and augmented data. Moreover, we propose a dual consistency (i.e., label-oriented consistency and instance-oriented consistency) regularization to promote the model performance. Extensive experiments on our self-constructed dataset and three public datasets show the superiority of COVID-SA over state-of-the-art baselines on various applications. IEEE

4.

DBF-Net: a semi-supervised dual-task balanced fusion network for segmenting infected regions from lung CT images.

Lu, Xiaoyan; Xu, Yang; Yuan, Wenhao.

Evol Syst (Berl) ; 14(3): 519-532, 2023.

Article in English | MEDLINE | ID: covidwho-2316744

ABSTRACT

Accurate segmentation of infected regions in lung computed tomography (CT) images is essential to improve the timeliness and effectiveness of treatment for coronavirus disease 2019 (COVID-19). However, the main difficulties in developing lung lesion segmentation for COVID-19 are still the fuzzy boundary of the lung-infected region, the low contrast between the infected region and the normal trend region, and the difficulty in obtaining labeled data. To this end, we propose a novel dual-task consistent network framework that uses multiple inputs to continuously learn and extract lung infection region features, which are used to generate reliable label images (pseudo-labels) and expand the dataset. Specifically, we periodically feed multiple sets of raw and data-enhanced images into two trunk branches of the network; the characteristics of the lung infection region are extracted by a lightweight double convolution (LDC) module and fusiform equilibrium fusion pyramid (FEFP) convolution in the backbone. According to the learned features, the infected regions are segmented, and pseudo-labels are made based on the semi-supervised learning strategy, which effectively alleviates the semi-supervised problem of unlabeled data. Our proposed semi-supervised dual-task balanced fusion network (DBF-Net) creates pseudo-labels on the COVID-SemiSeg dataset and the COVID-19 CT segmentation dataset. Furthermore, we perform lung infection segmentation with the DBF-Net model, achieving a segmentation sensitivity of 70.6% and specificity of 92.8%. The results of the investigation indicate that the proposed network greatly enhances the segmentation ability of COVID-19 infection.
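
For readers unfamiliar with the two metrics reported above, here is a short sketch of how sensitivity and specificity are conventionally computed from a predicted binary mask and a reference mask. The toy masks are invented and unrelated to the paper's data.

```python
def sensitivity_specificity(pred, truth):
    """Return (sensitivity, specificity) for two flat binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # infected pixels found
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # healthy pixels kept
    return sensitivity, specificity


# Made-up masks: 1 = infected pixel, 0 = healthy pixel.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(pred, truth)
```

Here two of three infected pixels are found (sensitivity 2/3) and four of five healthy pixels are left alone (specificity 0.8).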

5.

Aspect-Based Sentiment Analysis with Semi-Supervised Approach on Taiwan Social Distancing App User Reviews

Nuha, U.; Lin, C. H..

5th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2023 ; : 444-447, 2023.

Article in English | Scopus | ID: covidwho-2306891

ABSTRACT

Sentiment analysis has a critical role to reveal an opinion in a text-based form. Therefore, we exploit this analysis to discover the sentiment polarity of Taiwan Social Distancing mobile application. This paper proposes a semi-supervised scheme for annotating this mobile application's reviews. The semi-supervised scheme utilized a combination of numeric rating and lexicon-based sentiment. In addition, we also perform the sentiment analysis on an aspect-based level. Based on the experiment, we decide to select three aspects to be analyzed. This paper also evaluates the proposed scheme by implementing bidirectional encoder representations from transformers (BERT) and multilayer perceptron (MLP) as the classification model using the sentiment label of the proposed scheme. The result shows that the annotation of the proposed scheme outperforms the data annotation using counterpart models. © 2023 IEEE.

6.

Semi-Supervised Machine Learning for Analyzing COVID-19 Related Twitter Data for Asian Hate Speech

Richardson, C.; Shah, S.; Yuan, X..

21st IEEE International Conference on Machine Learning and Applications, ICMLA 2022 ; : 1643-1648, 2022.

Article in English | Scopus | ID: covidwho-2302528

ABSTRACT

The COVID-19 pandemic left a lot of people sick, tired, and frustrated. Many people expressed their feelings on social media through comments and posts. Detecting hate speech on social media is important to help reduce the spread of racist comments. Machine learning algorithms can be used to classify hate speech. In our experiments, we implement semi-supervised machine learning algorithms to classify Twitter data. We used a count vectorizer as the feature and a support vector machine (SVM) classifier to classify COVID-19 related Twitter data while changing the amount of labeled data available. We found that self-training semi-supervised machine learning has similar effectiveness to supervised learning when there is significantly less training data available. © 2022 IEEE.
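
The self-training idea can be sketched as a loop that labels confident unlabeled examples and adds them to the training set. To stay dependency-free, this sketch swaps the SVM for a nearest-centroid scorer over word counts; the `margin` parameter, helper names, and toy texts are all hypothetical and not taken from the paper.

```python
from collections import Counter


def vectorize(text):
    """Bag-of-words token counts (a stand-in for a count vectorizer)."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average count vector of one class."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {tok: cnt / len(vectors) for tok, cnt in total.items()}


def score(vec, cen):
    """Dot product between a count vector and a class centroid."""
    return sum(cnt * cen.get(tok, 0.0) for tok, cnt in vec.items())


def self_train(labeled, unlabeled, margin=0.5, rounds=5):
    """Move confidently scored texts from the pool into the labeled set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        by_class = {}
        for text, label in labeled:
            by_class.setdefault(label, []).append(vectorize(text))
        centroids = {c: centroid(vs) for c, vs in by_class.items()}
        confident, rest = [], []
        for text in pool:
            vec = vectorize(text)
            ranked = sorted(((score(vec, cen), c) for c, cen in centroids.items()),
                            reverse=True)
            # Confident only if the best class beats the runner-up by `margin`.
            if ranked[0][0] - ranked[1][0] >= margin:
                confident.append((text, ranked[0][1]))
            else:
                rest.append(text)
        if not confident:
            break
        labeled.extend(confident)
        pool = rest
    return labeled, pool


seed_labeled = [("stay safe everyone", "neutral"), ("blame them hate them", "hateful")]
seed_unlabeled = ["hate them all", "everyone stay home", "vaccines arrive tomorrow"]
final_labeled, remaining = self_train(seed_labeled, seed_unlabeled)
```

The last text shares no words with either class, so it never clears the margin and stays in the unlabeled pool.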

7.

Computer-Aided System for COVID-19 Using Semi-supervised-based Ensemble Learning and Reinforcement Learning

Liu, M.; Yuan, Y.; Yang, M.; Pu, H.; Wang, X.; Liu, M..

8th IEEE International Conference on Computer and Communications, ICCC 2022 ; : 2334-2338, 2022.

Article in English | Scopus | ID: covidwho-2298980

ABSTRACT

Coronavirus Disease 2019 (COVID-19) has shocked the world with its rapid spread and enormous threat to life and has continued up to the present. In this paper, a computer-aided system is proposed to detect infections and predict the disease progression of COVID-19. A high-quality CT scan database labeled with time-stamps and clinicopathologic variables is constructed to provide data support. To our knowledge, it is the only database with time relevance in the community. An object detection model is then trained to annotate infected regions. Using those regions, we detect the infections using a model with semi-supervised-based ensemble learning and predict the disease progression depending on reinforcement learning. We achieve an mAP of 0.92 for object detection. The accuracy for detecting infections is 98.46%, with a sensitivity of 97.68%, a specificity of 99.24%, and an AUC of 0.987. Significantly, the accuracy of predicting disease progression is 90.32% according to the timeline. It is a state-of-the-art result and can be used for clinical usage. © 2022 IEEE.

8.

Semi-Supervised KPCA-Based Monitoring Techniques for Detecting COVID-19 Infection through Blood Tests.

Harrou, Fouzi; Dairi, Abdelkader; Dorbane, Abdelhakim; Kadri, Farid; Sun, Ying.

Diagnostics (Basel) ; 13(8), 2023 Apr 18.

Article in English | MEDLINE | ID: covidwho-2296206

ABSTRACT

This study introduces a new method for identifying COVID-19 infections using blood test data as part of an anomaly detection problem by combining the kernel principal component analysis (KPCA) and one-class support vector machine (OCSVM). This approach aims to differentiate healthy individuals from those infected with COVID-19 using blood test samples. The KPCA model is used to identify nonlinear patterns in the data, and the OCSVM is used to detect abnormal features. This approach is semi-supervised as it uses unlabeled data during training and only requires data from healthy cases. The method's performance was tested using two sets of blood test samples from hospitals in Brazil and Italy. Compared to other semi-supervised models, such as KPCA-based isolation forest (iForest), local outlier factor (LOF), elliptical envelope (EE) schemes, independent component analysis (ICA), and PCA-based OCSVM, the proposed KPCA-OCSVM approach achieved enhanced discrimination performance for detecting potential COVID-19 infections. For the two COVID-19 blood test datasets that were considered, the proposed approach attained an AUC (area under the receiver operating characteristic curve) of 0.99, indicating a high accuracy level in distinguishing between positive and negative samples based on the test results. The study suggests that this approach is a promising solution for detecting COVID-19 infections without labeled data.

9.

Fraud Detection Model Using Semi-supervised Learning

Priya,; Sharma, K..

11th International Conference on Soft Computing for Problem Solving, SocProS 2022 ; 547:395-406, 2023.

Article in English | Scopus | ID: covidwho-2277017

ABSTRACT

Everything is moving to online platforms in this digital age. The frauds connected to this are likewise rising quickly. After COVID, the amount of fraudulent transactions increased, making this a very essential area of research. This study intends to develop a fraud detection model using machine learning's semi-supervised approach. It combines supervised and unsupervised learning methods and is far more practical than the other two. A bank fraud detection model utilizing the Laplacian model of semi-supervised learning is created. To determine the optimal model, the parameters were adjusted over a wide range of values. This model's strength is that it can handle a big volume of unlabeled data with ease. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

10.

Identifying Similar Questions in the Medical Domain Using a Fine-tuned Siamese-BERT Model

Merchant, A.; Shenoy, N.; Bharali, A.; Anand Kumar, M..

19th IEEE India Council International Conference, INDICON 2022 ; 2022.

Article in English | Scopus | ID: covidwho-2271937

ABSTRACT

A large number of people search about their health related problems on the web. However, the number of sites with qualified and verified people answering their queries is quite low in comparison to the number of questions being put up. The rate of queries being searched on such sites has further increased due to the COVID-19 pandemic. The main reason people find it difficult to find solutions to their queries is due to ineffective identification of semantically similar questions in the medical domain. For most cases, answers to the queries people ask would be present, the only caveat being the question may be present in a different form than the one asked by the particular user. In this research, we propose a Siamese-based BERT model to detect similar questions using a fine-tuning approach. The network is fine-tuned with medical question-answer pairs and then with question-question pairs to get a better question similarity prediction. © 2022 IEEE.

11.

A Semi-Supervised Learning Using Tri-Classifier Model with Voting for COVID-19 Cough Classification

Chen, Yuh-Shyan; Cheng, Kuang-Hung; Hsu, Chih-Shun; Lin, Tzu-Hung.

International Journal of Pattern Recognition & Artificial Intelligence ; : 1, 2023.

Article in English | Academic Search Complete | ID: covidwho-2254666

ABSTRACT

Due to the increasing severity of the COVID-19 pandemic, timely screening and diagnosis of infections are essential. Since cough is a common symptom of COVID-19, an AI-assisted cough classification scheme is designed in this paper to diagnose COVID-19 infection. To reduce the labeling efforts by human experts, a semi-supervised learning with voting scheme using a triple-classifier model is proposed for the COVID-19 cough classification. This work aims to improve the accuracy of the classification. Initially, the data pre-processing scheme is executed by performing data cleaning, resampling, and data enhancement so as to improve the audio quality before training. The pre-training scheme is then performed using a small amount of labeled COVID-19 cough data. Then we modify a well-known self-supervised learning model, SimCLR, to a semi-supervised learning-based SimCLR-like model, which uses three different loss functions to fine-tune three training models for cough classification. Finally, a voting scheme is performed based on the classification results of the three cough classifiers so as to enhance the accuracy of the cough classification for COVID-19. The experiment results illustrate that the proposed scheme can achieve 85% accuracy, which outperforms the existing semi-supervised learning-based classification schemes.
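
The final voting step can be illustrated with a plain majority vote over three classifiers' outputs. The abstract does not specify the voting rule, so hard majority voting is an assumption here, and the classifier outputs below are invented.

```python
from collections import Counter


def majority_vote(predictions):
    """predictions: one label list per classifier; returns the voted labels."""
    voted = []
    for labels in zip(*predictions):
        # most_common(1) yields the label chosen by the most classifiers.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three cough classifiers on four recordings.
clf_a = ["covid", "healthy", "covid", "healthy"]
clf_b = ["covid", "covid", "healthy", "healthy"]
clf_c = ["healthy", "covid", "covid", "healthy"]
voted = majority_vote([clf_a, clf_b, clf_c])
```

With three voters and two classes a tie cannot occur, which is one practical reason to use an odd number of classifiers.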

12.

Improving Semi-supervised Deep Learning by using Automatic Thresholding to Deal with Out of Distribution Data for COVID-19 Detection using Chest X-ray Images

Benavides-Mata, I.; Calderon-Ramirez, S..

4th IEEE International Conference on BioInspired Processing, BIP 2022 ; 2022.

Article in English | Scopus | ID: covidwho-2251797

ABSTRACT

Semi-supervised learning (SSL) leverages both labeled and unlabeled data for training models when the labeled data is limited and the unlabeled data is vast. Frequently, the unlabeled data is more widely available than the labeled data, hence this data is used to improve the level of generalization of a model when the labeled data is scarce. However, in real-world settings unlabeled data might depict a different distribution than the labeled dataset distribution. This is known as distribution mismatch. Such problem generally occurs when the source of unlabeled data is different from the labeled data. For instance, in the medical imaging domain, when training a COVID-19 detector using chest X-ray images, different unlabeled datasets sampled from different hospitals might be used. In this work, we propose an automatic thresholding method to filter out-of-distribution data in the unlabeled dataset. We use the Mahalanobis distance between the labeled and unlabeled datasets using the feature space built by a pre-trained Image-net Feature Extractor (FE) to score each unlabeled observation. We test two simple automatic thresholding methods in the context of training a COVID-19 detector using chest X-ray images. The tested methods provide an automatic manner to define what unlabeled data to preserve when training a semi-supervised deep learning architecture. © 2022 IEEE.
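
As a rough illustration of the scoring step, the sketch below computes the Mahalanobis distance of each unlabeled feature vector from the labeled feature distribution and drops distant ones. It assumes two-dimensional features so the covariance inverse can be written by hand, and it uses a fixed, hypothetical threshold instead of the paper's automatic thresholding methods, which the abstract does not describe.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D feature vectors."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, mean, cov):
    """Distance of `point` from `mean`, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    squared = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(squared)


# Made-up 2-D features; real features would come from a pre-trained extractor.
labeled_features = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
unlabeled_features = [(1.0, 1.0), (1.5, 0.5), (9.0, 9.0)]

mean, cov = mean_and_cov_2d(labeled_features)
THRESHOLD = 3.0  # hypothetical cut-off, not a value from the paper
kept = [f for f in unlabeled_features if mahalanobis_2d(f, mean, cov) <= THRESHOLD]
far_distance = mahalanobis_2d((9.0, 9.0), mean, cov)
```

The point at (9, 9) lies far outside the labeled cloud and is filtered out, while the two nearby points are kept.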

13.

A Review of Deep Learning Imaging Diagnostic Methods for COVID-19

Zhou, T.; Liu, F.; Lu, H.; Peng, C.; Ye, X..

Electronics (Switzerland) ; 12(5), 2023.

Article in English | Scopus | ID: covidwho-2288968

ABSTRACT

COVID-19 (coronavirus disease 2019) is a new viral infection disease that is widely spread worldwide. Deep learning plays an important role in COVID-19 image diagnosis. This paper reviews the recent progress of deep learning in COVID-19 image applications from five aspects. Firstly, 33 COVID-19 datasets and data enhancement methods are introduced. Secondly, COVID-19 classification methods based on supervised learning are summarized from four aspects: VGG, ResNet, DenseNet and Lightweight Networks. The COVID-19 segmentation methods based on supervised learning are summarized from four aspects: attention mechanism, multiscale mechanism, residual connectivity mechanism, and dense connectivity mechanism. Thirdly, the application of deep learning in semi-supervised COVID-19 image diagnosis is reviewed in terms of consistency regularization methods and self-training methods. Fourthly, the application of deep learning in unsupervised COVID-19 diagnosis is reviewed in terms of autoencoder methods and unsupervised generative adversarial methods. Moreover, the challenges and future work of COVID-19 image diagnostic methods in the field of deep learning are summarized. This paper reviews the latest research status of COVID-19 image diagnosis in deep learning, which is of positive significance to the detection of COVID-19. © 2023 by the authors.

14.

Aggregate Learning for Mixed Frequency Data

Toda, T.; Moriwaki, D.; Ota, K..

2022 IEEE International Conference on Big Data, Big Data 2022 ; : 4157-4165, 2022.

Article in English | Scopus | ID: covidwho-2284210

ABSTRACT

Large and acute economic shocks such as the 2007-2009 financial crisis and the current COVID-19 infections rapidly change the economic environment. In such a situation, real-time analysis of regional heterogeneity of economic conditions using alternative data is essential. We take advantage of the spatio-temporal granularity of alternative data and propose a Mixed-Frequency Aggregate Learning (MF-AGL) model that predicts economic indicators for smaller areas in real time. We apply the model to a real-world problem: prediction of the number of job applicants, which is closely related to the unemployment rate. We find that the proposed model predicts (i) the regional heterogeneity of the labor market condition and (ii) the rapidly changing economic status. The model can be applied to various tasks, especially economic analysis. © 2022 IEEE.

15.

An Extensive Study on HAR Systems to Recognize Daily Activities using Deep Learning Approaches

Tippani, G.; Gampala, V..

1st IEEE International Conference on Automation, Computing and Renewable Systems, ICACRS 2022 ; : 736-742, 2022.

Article in English | Scopus | ID: covidwho-2284161

ABSTRACT

"Human Activity Recognition" (HAR) refers to the ability to recognise human physical movements using wearable devices or IoT sensors. In this epidemic, the majority of patients, particularly the elderly and those who are extremely ill, are placed in isolation units. Because of the quick development of COVID, it's tough for caregivers or others to keep an eye on them when they're in the same room. People are fitted with wearable gadgets to monitor them and take required precautions, and IoT-based video capturing equipment is installed in the isolation ward. The existing systems are designed to record and categorise six common actions, including walking, jogging, going upstairs, downstairs, sitting, and standing, using multi-class classification algorithms. This paper discusses the advantages and limitations associated with developing the model using deep learning approaches on the live streaming data through sensors using different publicly available datasets. © 2022 IEEE

16.

Semi-supervised COVID-19 volumetric pulmonary lesion estimation on CT images using probabilistic active contour and CNN segmentation.

Rodriguez-Obregon, Diomar Enrique; Mejia-Rodriguez, Aldo Rodrigo; Cendejas-Zaragoza, Leopoldo; Gutiérrez Mejía, Juan; Arce-Santana, Edgar Román; Charleston-Villalobos, Sonia; Aljama-Corrales, Tomas; Gabutti, Alejandro; Santos-Díaz, Alejandro.

Biomed Signal Process Control ; 85: 104905, 2023 Aug.

Article in English | MEDLINE | ID: covidwho-2278569

ABSTRACT

Purpose: A semi-supervised two-step methodology is proposed to obtain a volumetric estimation of COVID-19-related lesions on Computed Tomography (CT) images. Methods: First, damaged tissue was segmented from CT images using a probabilistic active contours approach. Second, lung parenchyma was extracted using a previously trained U-Net. Finally, volumetric estimation of COVID-19 lesions was calculated considering the lung parenchyma masks. Our approach was validated using a publicly available dataset containing 20 CT COVID-19 images previously labeled and manually segmented. Then, it was applied to CT scans of 295 COVID-19 patients admitted to an intensive care unit. We compared the lesion estimation between deceased and survived patients for high and low-resolution images. Results: A comparable median Dice similarity coefficient of 0.66 for the 20 validation images was achieved. For the 295 images dataset, results show a significant difference in lesion percentages between deceased and survived patients, with a p-value of 9.1 × 10⁻⁴ in low-resolution and 5.1 × 10⁻⁵ in high-resolution images. Furthermore, the difference in lesion percentages between high and low-resolution images was 10% on average. Conclusion: The proposed approach could help estimate the lesion size caused by COVID-19 in CT images and may be considered an alternative to getting a volumetric segmentation for this novel disease without the requirement of large amounts of COVID-19 labeled data to train an artificial intelligence algorithm. The low variation between the estimated percentage of lesions in high and low-resolution CT images suggests that the proposed approach is robust, and it may provide valuable information to differentiate between survived and deceased patients.
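
Two quantities in this abstract lend themselves to a short worked example: the Dice similarity coefficient used for validation, and the lesion percentage relative to the lung parenchyma mask. The sketch below uses the standard Dice definition; the lesion-percentage formula (lesion voxels inside the lung divided by lung voxels) is an assumption about what the abstract means, and the masks are toy data.

```python
def dice(pred, truth):
    """Dice similarity coefficient of two flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    return 2.0 * intersection / total if total else 1.0


def lesion_percentage(lesion_mask, lung_mask):
    """Share of lung voxels that are also marked as lesion, in percent."""
    lung = sum(lung_mask)
    lesion_in_lung = sum(1 for l, g in zip(lesion_mask, lung_mask) if l and g)
    return 100.0 * lesion_in_lung / lung if lung else 0.0


# Made-up flattened masks: 1 = voxel belongs to the structure.
truth_lesion = [0, 1, 1, 1, 0, 0, 0, 0]
pred_lesion = [0, 1, 1, 0, 1, 0, 0, 0]
lung = [1, 1, 1, 1, 1, 1, 0, 0]

dsc = dice(pred_lesion, truth_lesion)
pct = lesion_percentage(pred_lesion, lung)
```

In this toy case the masks overlap on two of three voxels each, giving a Dice of 2/3, and three of six lung voxels are marked as lesion, giving 50%.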

17.

Data augmentation based semi-supervised method to improve COVID-19 CT classification.

Chen, Xiangtao; Bai, Yuting; Wang, Peng; Luo, Jiawei.

Math Biosci Eng ; 20(4): 6838-6852, 2023 02 06.

Article in English | MEDLINE | ID: covidwho-2254646

ABSTRACT

The Coronavirus (COVID-19) outbreak of December 2019 has become a serious threat to people around the world, creating a health crisis that infected millions of lives, as well as destroying the global economy. Early detection and diagnosis are essential to prevent further transmission. The detection of COVID-19 computed tomography images is one of the important approaches to rapid diagnosis. Many different branches of deep learning methods have played an important role in this area, including transfer learning, contrastive learning, ensemble strategy, etc. However, these works require a large number of samples with expensive manual labels, so in order to save costs, scholars adopted semi-supervised learning that applies only a few labels to classify COVID-19 CT images. Nevertheless, the existing semi-supervised methods focus primarily on class imbalance and pseudo-label filtering rather than on pseudo-label generation. Accordingly, in this paper, we organized a semi-supervised classification framework based on data augmentation to classify the CT images of COVID-19. We revised the classic teacher-student framework and introduced the popular data augmentation method Mixup, which widened the distribution of high confidence to improve the accuracy of selected pseudo-labels and ultimately obtain a model with better performance. For the COVID-CT dataset, our method makes precision, F1 score, accuracy and specificity 21.04%, 12.95%, 17.13% and 38.29% higher than average values for other methods, respectively. For the SARS-COV-2 dataset, these increases were 8.40%, 7.59%, 9.35% and 12.80%, respectively. For the Harvard Dataverse dataset, growth was 17.64%, 18.89%, 19.81% and 20.20%, respectively. The codes are available at https://github.com/YutingBai99/COVID-19-SSL.

Subject(s)

COVID-19 , Humans , COVID-19/diagnostic imaging , COVID-19/epidemiology , SARS-CoV-2 , Databases, Factual , Disease Outbreaks , Tomography, X-Ray Computed

18.

Random Forest in Whitelist-Based ATM Security

Maliszewski, M.; Boryczka, U..

Intelligent Information and Database Systems, Aciids 2022, Pt Ii ; 13758:292-301, 2022.

Article in English | Web of Science | ID: covidwho-2243050

ABSTRACT

Accelerated by the COVID-19 pandemic, the trend of highly-sophisticated logical attacks on Automated Teller Machines (ATMs) is ever-increasing nowadays. Due to the nature of attacks, it is common to use zero-day protection for the devices. The most secure solutions available are using whitelist-based policies, which are extremely hard to configure. This article presents the concept of a semi-supervised decision support system based on the Random forest algorithm for generating a whitelist-based security policy using the ATM usage data. The obtained results confirm that the Random forest algorithm is effective in such scenarios and can be used to increase the security of the ATMs.

19.

FaxMatch: Multi-Curriculum Pseudo-Labeling for semi-supervised medical image classification.

Peng, Zhen; Zhang, Dezhi; Tian, Shengwei; Wu, Weidong; Yu, Long; Zhou, Shaofeng; Huang, Shanhang.

Med Phys ; 50(5): 3210-3222, 2023 May.

Article in English | MEDLINE | ID: covidwho-2244151

ABSTRACT

BACKGROUND: Semi-supervised learning (SSL) can effectively use information from unlabeled data to improve model performance, which has great significance in medical imaging tasks. Pseudo-labeling is a classical SSL method that uses a model to predict unlabeled samples, selects the predictions with the highest confidence level as the pseudo-labels, and then uses the generated pseudo-labels to train the model. Most of the current pseudo-label-based SSL algorithms use predefined fixed thresholds for all classes to select unlabeled data. PURPOSE: However, data imbalance is a common problem in medical image tasks, where the use of a fixed threshold to generate pseudo-labels ignores the different learning status and learning difficulty of each class. The aim of this study is to develop an algorithm to solve this problem. METHODS: In this work, we propose Multi-Curriculum Pseudo-Labeling (MCPL), which evaluates the learning status of the model for each class at each epoch and automatically adjusts the thresholds for each class. We apply MCPL to FixMatch and propose a new SSL framework for medical image classification, which we call the improved algorithm FaxMatch. To mitigate the impact of incorrect pseudo-labels on the model, we use a label smoothing (LS) strategy to generate soft labels (SL) for pseudo-labels. RESULTS: We have conducted extensive experiments to evaluate our method on two public benchmark medical image classification datasets: the ISIC 2018 skin lesion analysis and COVID-CT datasets. Experimental results show that our method outperforms the fully supervised baseline, which uses only labeled data to train the model. Moreover, our method also outperforms other state-of-the-art methods. CONCLUSIONS: We propose MCPL and construct a semi-supervised medical image classification framework to reduce the reliance of the model on a large number of labeled images and reduce the manual workload of labeling medical image data.
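
The two ingredients named in the METHODS part, per-class confidence thresholds and label smoothing of pseudo-labels, can be sketched as follows. The abstract does not say how MCPL updates the thresholds each epoch, so the thresholds here are fixed, hypothetical values, and the smoothing factor `epsilon` and the probabilities are likewise invented.

```python
def smooth_label(class_index, num_classes, epsilon=0.1):
    """Soft label: 1 - epsilon on the chosen class, epsilon spread over all classes."""
    soft = [epsilon / num_classes] * num_classes
    soft[class_index] += 1.0 - epsilon
    return soft


def select_pseudo_labels(probabilities, thresholds, epsilon=0.1):
    """Keep samples whose top probability clears that class's own threshold."""
    selected = []
    for index, probs in enumerate(probabilities):
        cls = max(range(len(probs)), key=probs.__getitem__)
        if probs[cls] >= thresholds[cls]:
            selected.append((index, smooth_label(cls, len(probs), epsilon)))
    return selected


# Hypothetical thresholds: stricter for the frequent class 0, looser for rare class 1.
class_thresholds = [0.95, 0.70]
model_probs = [[0.97, 0.03], [0.90, 0.10], [0.25, 0.75], [0.45, 0.55]]
pseudo = select_pseudo_labels(model_probs, class_thresholds)
```

With a single fixed threshold of 0.95 the third sample would have been discarded; the lower threshold for the rare class lets it through, which is the imbalance problem the abstract describes.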

Subject(s)

COVID-19 , Humans , Curriculum , Algorithms , Benchmarking , Supervised Machine Learning

20.

Spanish Corpora of tweets about COVID-19 vaccination for automatic stance detection

Martínez, R. Y.; Blanco, G.; Lourenço, A..

Information Processing and Management ; 60(3), 2023.

Article in English | Scopus | ID: covidwho-2233026

ABSTRACT

The paper presents new annotated corpora for performing stance detection on Spanish Twitter data, most notably Health-related tweets. The objectives of this research are threefold: (1) to develop a manually annotated benchmark corpus for emotion recognition taking into account different variants of Spanish in social posts; (2) to evaluate the efficiency of semi-supervised models for extending such corpus with unlabelled posts; and (3) to describe such short text corpora via specialised topic modelling. A corpus of 2,801 tweets about COVID-19 vaccination was annotated by three native speakers to be in favour (904), against (674) or neither (1,223) with a 0.725 Fleiss' kappa score. Results show that the self-training method with SVM base estimator can alleviate annotation work while ensuring high model performance. The self-training model outperformed the other approaches and produced a corpus of 11,204 tweets with a macro averaged f1 score of 0.94. The combination of sentence-level deep learning embeddings and density-based clustering was applied to explore the contents of both corpora. Topic quality was measured in terms of the trustworthiness and the validation index. © 2023 The Author(s)
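
Since the abstract reports inter-annotator agreement as a Fleiss' kappa score, a short sketch of the standard formula may help. Each row counts how many of the annotators placed one tweet in each category; the four-tweet table below is invented and is not the paper's annotation data.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who assigned item i to category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Observed agreement: average pairwise agreement per item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items
    # Expected agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy table: three raters, categories (in favour, against, neither).
ratings = [[3, 0, 0], [0, 3, 0], [0, 0, 3], [2, 1, 0]]
kappa = fleiss_kappa(ratings)
```

Three tweets have full agreement and one has a 2-to-1 split, which gives a kappa of 35/47, roughly 0.745.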
